Compiler Reduction of Invalidation Traffic in Virtual Shared Memory Systems

Authors

  • Michael F. P. O'Boyle
  • Rupert W. Ford
  • Andy Nisbet
Abstract

This paper presents new compiler analysis for the elimination of invalidation traffic in virtual shared memory, using a hybrid distributed invalidation coherence scheme. The invalidation and acknowledgement messages are removed; this reduces both network invalidation traffic and the latency of a write fault. It aggressively exploits the SPMD execution model and uses array section analysis to accurately determine only those instances when invalidation is necessary, thus avoiding the additional read misses of previous schemes. Equations determining precisely what data should be invalidated are presented and translated into a form amenable to compiler analysis. Preliminary experimental results on a 30 node prototype architecture demonstrate the performance attainable using this scheme.
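To make the array section idea concrete, here is a minimal sketch of how a compiler might decide which cached elements a processor should invalidate locally. This is not the paper's equations: the `Section` type, the block schedule, the `halo` parameter, and the two-phase read-then-write pattern are all assumptions chosen for illustration. The sketch intersects the section a processor read (and so may have cached) in one phase with the sections other processors write in the next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """Half-open index range [lo, hi) of a one-dimensional array (hypothetical type)."""
    lo: int
    hi: int

    def intersect(self, other: "Section") -> "Section":
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Section(lo, max(lo, hi))  # lo == hi means empty

    def is_empty(self) -> bool:
        return self.hi <= self.lo


def block_section(proc: int, nprocs: int, n: int, halo: int = 0) -> Section:
    """Elements touched by `proc` under an assumed block schedule, widened by `halo`."""
    size = (n + nprocs - 1) // nprocs
    lo = max(0, proc * size - halo)
    hi = min(n, (proc + 1) * size + halo)
    return Section(lo, hi)


def stale_sections(reader: int, nprocs: int, n: int, halo: int) -> list[Section]:
    """Sections cached by `reader` in phase 1 that other processors write in phase 2."""
    cached = block_section(reader, nprocs, n, halo)  # phase 1 reads include the halo
    stale = []
    for writer in range(nprocs):
        if writer == reader:
            continue  # a processor's own writes do not make its copy stale
        written = block_section(writer, nprocs, n)  # phase 2 writes cover the owned block only
        overlap = cached.intersect(written)
        if not overlap.is_empty():
            stale.append(overlap)
    return stale


if __name__ == "__main__":
    # 16 elements, 4 processors, one-element halo: processor 1 caches [3, 9).
    print(stale_sections(reader=1, nprocs=4, n=16, halo=1))
```

Under these assumptions, only the halo elements owned by neighbouring processors come out as stale. A processor that invalidates exactly those copies itself would need no invalidation or acknowledgement messages from the writers, and the rest of its cached block stays valid, so no extra read misses are incurred.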

Similar resources

A compiler algorithm to reduce invalidation latency in virtual shared memory systems

This paper presents a new compiler algorithm to eliminate invalidation traffic in virtual shared memory using a hybrid distributed invalidation scheme. It aggressively exploits static scheduling and data layout to accurately determine only those instances when invalidation is necessary, thus avoiding the additional read misses of previous schemes. Equations determining precisely what data shoul...

Fast & Cost Effective Cache Invalidation in DSM

Most distributed shared memory systems use point-to-point networks in conjunction with directory-based cache coherence protocols. A cache invalidation transaction generates a number of unicast invalidation messages and as many acknowledgment messages. This results in heavy network traffic, high latency, and high occupancy at home nodes. This paper introduces a fast cache invalidation method, calle...

User-Level VSM Optimization and its Application

This paper describes user-level optimisations for virtual shared memory (VSM) systems and demonstrates performance improvements for three scientific kernel codes written in Fortran-S and running on a 30 node prototype distributed memory architecture. These optimisations can be applied to all consistency models and directory schemes, whether in hardware or software, which employ an invalidation b...

Processor-Directed Cache Coherence Mechanism – A Performance Study

Cache coherent multiprocessor architecture is widely used in recent multi-core systems, embedded systems and massively parallel processors. With the ever increasing performance gap between processor and memory, there is a requirement for an optimal cache coherence mechanism in a cache coherent multiprocessor. The conventional directory based cache coherence scheme used in large scale multip...

Fine Grain Synchronisation in VSM Architectures

This paper presents a new scheme to replace coarse grain barriers with fine grain synchronisation in virtual shared memory systems. Traditionally, shared memory programming models separate data access from synchronisation. In our scheme synchronisation between both writes and their subsequent reads, and reads and their following writes, is achieved through the coherence tags associated with each ...

Publication date: 1996